Efficient Multi-label Classification with Many Labels
Abstract
In multi-label classification, each sample can be associated with a set of class labels. When the number of labels grows to the hundreds or even thousands, existing multi-label classification methods often become computationally inefficient. In recent years, a number of remedies have been proposed. However, they either rely on simple dimension reduction techniques or involve expensive optimization problems. In this paper, we address this problem by selecting a small subset of class labels that can approximately span the original label space. This is performed by an efficient randomized sampling procedure where the sampling probability of each class label reflects its importance among all the labels. Experiments on a number of real-world multi-label data sets with many labels demonstrate the appealing performance and efficiency of the proposed algorithm.
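The abstract does not spell out how the sampling probabilities are computed. As a minimal sketch, assuming that "importance" is measured by leverage scores (normalized squared row norms of the top-k right singular vectors of the label matrix), label subset selection could look like the following. The function names `leverage_scores`, `sample_labels`, and `reconstruct`, the synthetic matrix `Y`, and the choice `k = 5` are all illustrative assumptions, not taken from the paper.

```python
import numpy as np

def leverage_scores(Y, k):
    """Sampling probabilities for each label (column) of Y.

    Assumption: importance = squared row norm of the top-k right
    singular vectors, normalized to sum to 1.
    """
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    Vk = Vt[:k].T                      # shape (num_labels, k)
    p = np.sum(Vk ** 2, axis=1)
    return p / p.sum()

def sample_labels(Y, k, rng):
    """Draw k distinct label indices, weighted by leverage score."""
    p = leverage_scores(Y, k)
    cols = rng.choice(Y.shape[1], size=k, replace=False, p=p)
    return np.sort(cols)

def reconstruct(Y, cols):
    """Project Y onto the span of the selected label columns."""
    C = Y[:, cols]
    return C @ (np.linalg.pinv(C) @ Y)

# Synthetic binary label matrix: 200 samples, 30 correlated labels.
rng = np.random.default_rng(0)
base = rng.integers(0, 2, size=(200, 5))
mix = rng.integers(0, 2, size=(5, 30))
Y = (base @ mix > 0).astype(float)

cols = sample_labels(Y, 5, rng)
Y_hat = reconstruct(Y, cols)
rel_err = np.linalg.norm(Y - Y_hat) / np.linalg.norm(Y)
print("selected labels:", cols)
print("relative reconstruction error:", rel_err)
```

Under this reading, classifiers would only need to be trained for the selected labels, and the remaining labels would be recovered through the linear map `pinv(C) @ Y`; whether the paper recovers them this way is not stated in the abstract.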
Similar resources
Exploiting Associations between Class Labels in Multi-label Classification
Multi-label classification has many applications in text categorization, biology, and medical diagnosis, in which multiple class labels can be assigned to each training instance simultaneously. As it is often the case that there are relationships between the labels, extracting the existing relationships between the labels and taking advantage of them during the training or prediction phases ...
Efficient Methods for Multi-label Classification
As a generalized form of multi-class classification, multi-label classification allows each sample to be associated with multiple labels. This task becomes challenging when the number of labels grows large, which demands high efficiency. Many approaches have been proposed to address this problem, among which one of the main ideas is to select a subset of labels which can approximately span the or...
An Efficient Large-scale Semi-supervised Multi-label Classifier Capable of Handling Missing labels
Multi-label classification has received considerable interest in recent years. Multi-label classifiers have to address many problems, including handling large-scale datasets with many instances and a large set of labels, compensating for missing label assignments in the training set, considering correlations between labels, and exploiting unlabeled data to improve prediction performance. To ...
Multi-label Classification with Principle Label Space Transformation
We propose a novel hypercube view that perceives the label space of multi-label classification problems geometrically. The view allows us to not only unify many existing multi-label classification approaches, but also design a novel algorithm, Principle Label Space Transformation (PLST), which seeks important correlations between labels before learning. The simple and efficient PLST relies on on...
A New Kernel-Based Classification Algorithm for Multi-label Datasets
With the emergence of rich online content, efficient information retrieval systems are required. Application content includes rich text, speech, still images and videos. This content, either stored or queried, can be assigned to many classes or labels at the same time. This calls for the use of multi-label classification techniques. In this paper, a new kernel-based multi-label classification al...
Publication date: 2013